[BI-2055] - Remove missingValueString tablesaw values #1
I'm in favor of merging so we can test with bi-api.
I left a few questions in comments.
  // No default missing indicators
  // TODO: Allow this to be configurable?
  public static final ImmutableList<String> MISSING_INDICATORS =
-     ImmutableList.of(missingInd1, missingInd2, missingInd4, missingInd5);
+     ImmutableList.of();
Does it make sense to keep lines 25-31 around in this file at all, if those variables are not used?
Good catch! I was admittedly doing the bare minimum to get the desired change in functionality, lest I accidentally break something else, but those lines can reasonably be removed, so I just added a commit that removes them!
@@ -528,7 +528,7 @@ void testWithMissingValue2() throws IOException {

   Table t = Table.read().csv("../data/missing_values2.csv");
   assertEquals(1, t.stringColumn(0).countMissing());
-  assertEquals(1, t.numberColumn(1).countMissing());
+  assertEquals(0, t.numberColumn(1).countMissing());
Why was this change needed?
The test counts missing values, which include both the empty string itself and any value in MISSING_INDICATORS, since those were converted to an empty string. Removing values from MISSING_INDICATORS meant that one of the values in the CSV no longer counts as missing, so the expected count in the test had to be adjusted.
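To make the counting rule concrete, here is a minimal plain-Java sketch. It is not tablesaw code: `MissingCountSketch`, `countMissing`, and the column contents are hypothetical, and the real contents of missing_values2.csv are not shown in this PR. The old indicator list uses the four values named in the PR description.

```java
import java.util.List;

// Hypothetical model of the counting rule described above; this is not tablesaw code.
class MissingCountSketch {

    // A cell counts as missing if it is empty or matches one of the indicators.
    static long countMissing(List<String> cells, List<String> missingIndicators) {
        return cells.stream()
                .filter(cell -> cell.isEmpty() || missingIndicators.contains(cell))
                .count();
    }

    public static void main(String[] args) {
        // Made-up columns: one with a truly empty cell, one with an indicator value.
        List<String> columnWithEmptyCell = List.of("a", "", "c");
        List<String> columnWithIndicator = List.of("1.5", "N/A", "3.0");

        List<String> oldIndicators = List.of("N/A", "NaN", "*", "null"); // values named in the PR description
        List<String> newIndicators = List.of();                          // mirrors ImmutableList.of()

        // The empty cell is missing under both settings.
        System.out.println(countMissing(columnWithEmptyCell, oldIndicators)); // 1
        System.out.println(countMissing(columnWithEmptyCell, newIndicators)); // 1

        // The indicator value is only missing while it is in the list.
        System.out.println(countMissing(columnWithIndicator, oldIndicators)); // 1
        System.out.println(countMissing(columnWithIndicator, newIndicators)); // 0
    }
}
```

Under this model, a column whose only "missing" cell was an indicator value drops from 1 to 0, while a column with a genuinely empty cell stays at 1, which matches the shape of the assertion change in the diff.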
Description
Story: BI-2055 - Remove missingValueString tablesaw values
Default tablesaw behavior silently converts certain values to an empty string when processing tables, which can lead to unexpected behavior when importing files in DeltaBreed. This follows up on work implemented in BI-1993 to also remove this silent conversion for N/A, NaN, *, and null values and update associated tests.
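As a rough illustration of the silent conversion described above, the following plain-Java sketch models what happens to imported cells. `SilentConversionSketch`, `importCells`, and the sample row are hypothetical and only approximate the behavior; they are not taken from tablesaw or bi-api.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical model of the silent conversion; not the tablesaw implementation.
class SilentConversionSketch {

    // Replace any cell that matches a missing indicator with an empty string.
    static List<String> importCells(List<String> rawCells, List<String> missingIndicators) {
        return rawCells.stream()
                .map(cell -> missingIndicators.contains(cell) ? "" : cell)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up row in which the user really did type "null" and "*".
        List<String> row = List.of("Plot 1", "null", "*");

        // With the old defaults, two user-entered values are lost without warning.
        System.out.println(importCells(row, List.of("N/A", "NaN", "*", "null")));

        // With an empty indicator list, the row is imported exactly as written.
        System.out.println(importCells(row, List.of()));
    }
}
```

With an empty indicator list, only cells that are already empty are treated as missing, so downstream validation sees the values the user actually supplied.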
Dependencies
bi-api: bug/BI-2055
Testing
Will be tested in bi-api PR after this is merged, due to difficulties in utilizing tablesaw from a different branch